The section explains the Hive tables in Loyalty, including details such as schema and how dynamic attributes are stored and queried.
Introduction to Hive in Loyalty
The Marigold Engagement Data Platform (Loyalty) is a world class data management platform that collects, unifies, and comprehends data in the context of the moment.
Loyalty is based on prescriptive, person centric data model, built especially for marketing use cases. The platform uses modern technologies and one among them is Hive, used for data storage. Hive is highly scalable and has good usability features due to the familiar relational or SQL interface, though it is not built on relational tables. In our platform, Hive is mainly used to query data across a collection of people, especially for segmentation, analytics and machine learning. For example, to compile an audience list to send out an email. The main Hive tables in the platform include:
- Member
- Activities
- Events
- Campaign
There are prebuilt capabilities in the platform, to generate the queries for these tables to execute processes such as for segmentation and analytics.
Tables
Member Table
The member table contains all the member data that is defined in the People Attributes in the Definition tab.
Please note that the term “dynamic attribute” means that the attribute is customer defined and NOT a “System” attribute. |
Member Table Columns
Column |
Type |
Description |
Example |
member_id |
STRING |
The primary identifier of the MEMBER table. |
M-000000001 |
|
STRING |
The email address of the member. |
foo@bar.com |
first_name |
STRING |
The member’s first name. |
John |
last_name |
STRING |
The member’s last name. |
Doe |
mailing_street |
STRING |
The member’s mailing street address. |
|
mailing_city |
STRING |
The member’s mailing city address. |
Oklahoma City |
mailing_state |
STRING |
The member’s mailing state address. |
Oklahoma |
mailing_postal_code |
STRING |
The member’s mailing postal code or zip code. |
73101 |
mailing_country |
STRING |
The member’s mailing country. |
US |
birthdate |
DATE |
The member’s birth date. |
1990-05-11 |
mobile_phone |
STRING |
The member’s mobile phone |
+14012342122 |
member_since |
TIMESTAMP |
The timestamp of the member’s sign-up. |
2020-01-01 01:01:01.000 |
receive_email_offers |
BOOLEAN |
A flag used for filtering out members that will be skipped for email offers. |
true or false |
receive_sms_offers |
BOOLEAN |
A flag used for filtering out members that will be skipped for sms offers. |
true or false |
receive_mobile_app_offers |
BOOLEAN |
A flag used for filtering out members that will be skipped for mobile app offers. |
true or false |
receive_mail_offers |
BOOLEAN |
A flag used for filtering out members that will be skipped for mail offers. |
true or false |
receive_e_statements |
BOOLEAN |
A flag used for filtering out members that will be skipped for electronic statements. |
true or false |
current_tier |
STRING |
The member’s current tier. |
Silver |
expiration_date |
TIMESTAMP |
The expiration date of the member’s tier. |
2020-01-01 01:01:01.000 |
tier_in_progression |
STRING |
The projected tier once the current tier expires. |
Gold |
suspend_email |
BOOLEAN |
A flag used for filtering out email addresses that will be skipped for emails. |
true or false |
suspend_email_date |
TIMESTAMP |
The timestamp when the suspend_email attribute was flagged to true. |
2020-01-01 01:01:01.000 |
suspend_email_cause |
STRING |
The reason why the suspend_email flag was set to true. |
email bounced |
referrer_id |
STRING |
The member_id that referred this member to sign-up in the customer’s Loyalty/Loyalty platform. |
M-000000002 |
last_active_at |
TIMESTAMP |
The timestamp when the member was last active. |
2020-01-01 01:01:01.000 |
deactivated |
BOOLEAN |
A flag if the member is deactivated. |
true or false |
integration_id |
STRING |
Reference to the unique identifier of the customer’s member database. |
12312312-15678 |
gender |
STRING |
The member’s gender. |
Male or Female; M or F |
marital_status |
STRING |
The member’s marital status. |
SINGLE |
education |
STRING |
The member’s education details. |
|
income |
STRING |
The member’s income details. |
|
employment |
STRING |
The member’s employment details. |
|
ethnicity |
STRING |
The member’s ethnicity. |
|
test_member |
BOOLEAN |
A flag if the member is just a test member. |
true or false |
visitor |
BOOLEAN |
A flag if the member is just a visitor, meaning the member has not signed up yet.. |
true or false |
suspend_redemption |
BOOLEAN |
A flag if the member redemption should be prevented. |
true or false |
card_id |
STRING |
The member’s card ID, it can be different to the member’s member_id. |
1233123-12312315647 |
receive_newsfeed_like_notification |
BOOLEAN |
true or false |
|
receive_newsfeed_comment_notification |
BOOLEAN |
true or false |
|
created_at |
TIMESTAMP |
The timestamp when the member was created on Marigold’s platform. |
2020-04-04 01:01:01 |
has_children |
BOOLEAN |
true or false |
|
sl_segments |
ARRAY<STRING> |
The list of Golden Record Segments, the member belongs to. |
[“segment1”, “segment2”] |
child_member_ids |
ARRAY<STRING> |
The list of child accounts for the member account. |
[“M-00000009”, “M-000000011”] |
bigint_attr |
MAP<STRING,BIGINT> |
This column contains the dynamic attributes of a member with type BIGINT. |
{ “dyn_bigint” : 10} |
decimal_attr |
MAP<STRING,DECIMAL> |
This column contains the dynamic attributes of a member with type DECIMAL. |
{ “dyn_decimal” : 10.1} |
timestamp_attr |
MAP<STRING,TIMESTAMP> |
This column contains the dynamic attributes of a member with type TIMESTAMP/DATETIME. |
{ “dyn_ts” : “2020-04-04 01:01:01.000”} |
date_attr |
MAP<STRING,DATE> |
This column contains the dynamic attributes of a member with type DATE. |
{ “dyn_date” : “2020-04-04”} |
string_attr |
MAP<STRING,STRING> |
This column contains the dynamic attributes of a member with type STRING. |
{ “dyn_str” : “a customer defined value”} |
boolean_attr |
MAP<STRING,BOOLEAN> |
This column contains the dynamic attributes of a member with type BOOLEAN. |
{ “dyn_bool” : true} |
list_of_string_attr |
MAP<STRING,ARRAY<STRING>> |
This column contains the dynamic attributes of a member with type LIST OF STRING. |
{ “dyn_list_of_str” :[‘a’, ‘b’, ‘c’]} |
all_time_activities |
MAP<STRING,BIGINT> |
This column contains the all time count per activity type for a member. |
{“purchase” : 10, “redemption” : 1} |
all_time_metrics_earned |
MAP<STRING,DECIMAL> |
This column contains the all time earned values for each metric. |
{ “point”: 100, “spend” : 50} |
all_time_metrics_redeemed |
MAP<STRING,DECIMAL> |
This column contains the all time redeemed values for each metric. |
{ “point”: 100, “spend” : 50} |
all_time_metrics_expired |
MAP<STRING,DECIMAL> |
This column contains the all time expired values for each metric. |
{ “point”: 100, “spend” : 50} |
sl_business_units |
ARRAY<STRING> |
The list of business units the member belongs to. |
[“us”, “uk”] |
parent_integration_id |
STRING |
||
person_type |
STRING |
||
updated_at |
TIMESTAMP |
The timestamp when the member profile was updated. |
2020-01-01 01:01:01.000 |
synced_at |
TIMESTAMP |
The timestamp when the member was synchronized to the Hive MEMBER table. |
2020-01-01 01:01:01.000 |
Writing Queries Against the Member Table
Basic Queries
SELECT member_id, first_name, last_name, email
FROM MEMBER_TABLE
LIMIT 1
Querying Dynamic Attributes
Marigold’s query engine allows the dynamic attributes to be abstracted to query writers. For example, we have a dynamic attribute named: a_boolean_attribute. Query writers can just write it this way:
SELECT count(1)
FROM MEMBER_TABLE
WHERE a_boolea
n_attribute = true
Since it was a dynamic attribute, the query will automatically transformed by Marigold’s query engine to:
SELECT count(1)
FROM MEMBER_TABLE
WHERE boolean_attr[“a_boolean_attribute”] = true
Again, querying for dynamic columns and static columns are no different in Marigold’s query engine.
Activity Table
The activity table contains all the data related to the activities in the system that is defined in the Activity Types in the Definition tab. Activities are user interactions like purchases, reward redemptions, challenge responses, email delivery information, profile and preference updates, etc.
Please note that the term “dynamic attribute” means that the attribute is customer defined and NOT a “System” attribute. |
Activity Table Columns
Column |
Type |
Description |
Example |
sl_ext_id |
STRING |
The external identifier of the activity. |
12312312-1aufasdelerwt |
sl_activity_ts |
TIMESTAMP |
The timestamp of the activity. |
2020-01-01 01:01:01.000 |
sl_member_id |
STRING |
The member_id in which the activity is connected to. This is NULL for anonymous activities. This can be joined with the member_id column on the MEMBER table. |
M-000000001 |
sl_type |
STRING |
The activity type. |
sl_purchase |
sl_context |
STRING |
The context of the activity. |
|
sl_id |
STRING |
Internal unique identifier of the activity per member. |
1 |
sl_subtype |
STRING |
The subtype of the activity. |
pending |
sl_label |
STRING |
The activity label is displayed on the Marigold Marketing Console. |
Campaign Offer #441 |
sl_parent_id |
BIGINT |
The row id of the parent object (when applicable) |
12345 |
sl_parent_name |
STRING |
The internal name of the parent object (when applicable) |
obj_internal_name |
sl_metric |
STRING |
The metric relevant to the activity. This can be null. |
point |
sl_value |
DECIMAL |
The metric value for the activity. Depending on the configured earn rules, this can be relevant or not. If we want to query the earned metrics for the activity, refer to the earned_metrics column. |
11 |
sl_integration_id |
STRING |
Unique identifier of the activity. |
|
sl_location |
STRING |
The location of the activity. It could be a postal ID, depending on the customer’s program. |
|
sl_comment |
STRING |
Some special notes or comments to the activity. |
This was issued due to some error. |
sl_processed_ts |
TIMESTAMP |
The mutable timestamp when the activity was last processed. If the activity was reprocessed, this value would be updated. |
2020-01-01 01:01:01.000 |
sl_received_ts |
TIMESTAMP |
The timestamp when the activity was received by the Marigold system. |
2020-01-01 01:01:01.000 |
earn_type |
STRING |
This could either be “earned”, “redeemed”, or “expired”. |
earned |
sl_tz |
STRING |
The timezone of the activity timestamp. |
UTC |
sl_offset |
BIGINT |
The timezone offset relative to GMT or UTC. |
-7 |
sl_first_processed_ts |
TIMESTAMP |
The immutable timestamp when the activity was first processed. If the activity was reprocessed, this value would NOT change. |
2020-01-01 01:01:01.000 |
bigint_attr |
MAP<STRING, BIGINT> |
This column contains the dynamic attributes of an activity with type BIGINT. |
{ “dyn_bigint” : 10} |
decimal_attr |
MAP<STRING, DECIMAL> |
This column contains the dynamic attributes of an activity with type DECIMAL. |
{ “dyn_decimal” : 10.1} |
timestamp_attr |
MAP<STRING, TIMESTAMP> |
This column contains the dynamic attributes of an activity with type TIMESTAMP/DATETIME. |
{ “dyn_ts” : “2020-04-04 01:01:01”} |
date_attr |
MAP<STRING, DATE> |
This column contains the dynamic attributes of an activity with type DATE. |
{ “dyn_date” : “2020-04-04”} |
string_attr |
MAP<STRING, STRING> |
This column contains the dynamic attributes of an activity with type STRING. |
{ “dyn_str” : “a customer defined value”} |
boolean_attr |
MAP<STRING, BOOLEAN> |
This column contains the dynamic attributes of an activity with type BOOLEAN. |
{ “dyn_bool” : true} |
list_of_string_attr |
MAP<STRING, ARRAY<STRING>> |
This column contains the dynamic attributes of an activity with type LIST OF STRING. |
{ “dyn_list_of_str” :[‘a’, ‘b’, ‘c’]} |
earned_metrics |
MAP<STRING, DECIMAL> |
The earned metric from the activity. This could contain multiple metrics. |
{“point” : 1 , “spend” : 5.0} |
metrics_earned_snapshot |
MAP<STRING, DECIMAL> |
The earned metrics snapshot before the activity was processed. This can be used for computing the accruals per activity. |
{“point” : 1 , “spend” : 5.0} |
metrics_redeemed_snapshot |
MAP<STRING, DECIMAL> |
The redeemed metrics snapshot before the activity was processed. This can be used for computing the accruals per activity. |
{“point” : 1 , “spend” : 5.0} |
metrics_expired_snapshot |
MAP<STRING, DECIMAL> |
The expired metrics snapshot before the activity was processed. This can be used for computing the accruals per activity. |
{“point” : 1 , “spend” : 5.0} |
Writing Queries Against the Activity Table
Basic Queries
SELECT sl_activity_ts, sl_type, sl_member_id
FROM ACTIVITY_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUE
Querying Dynamic Attributes.
Marigold’s query engine allows the dynamic attributes to be abstracted to query writers. For example, we have a dynamic attribute named: a_boolean_attribute. Query writers can just write it this way:
SELECT sl_activity_ts, sl_type, a_boolean_attribute
FROM ACTIVITY_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUE
Since it was a dynamic attribute, the query will automatically transformed by Marigold’s query engine to:
SELECT sl_activity_ts, sl_type, boolean_attr[“a_boolean_attribute”]
FROM ACTIVITY_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUE
Again, querying for dynamic columns and static columns are no different in Marigold’s query engine.
When doing ad hoc queries, please always reduce the query to a certain period so queries will be optimal. Always use in_period UDF if possible. See this. |
Activity Item Table
The activity item table contains all the line item data in the activity table. Each item in the activity table from the column “sl_items”, is written here on its own row.
Activity Table Columns
Column |
Type |
Description |
Example |
sl_ext_id |
STRING |
The external identifier of the activity. |
12312312-1aufasdelerwt |
sl_activity_ts |
TIMESTAMP |
The timestamp of the activity. |
2020-01-01 01:01:01.000 |
sl_member_id |
STRING |
The member_id in which the activity is connected to. This is NULL for anonymous activities. This can be joined with the member_id column on the MEMBER table and also the sl_member_id column on the ACTIVITY table. |
M-000000001 |
sl_type |
STRING |
The activity type. |
sl_purchase |
sl_item_id |
BIGINT |
A sequential count on the item number. n line items would mean 1 to n. |
1 |
sl_item_name |
STRING |
The item name. |
sofa |
sl_item_category |
STRING |
The line item category. |
furniture |
sl_id |
STRING |
Internal unique identifier of the activity per member. |
1 |
sl_context |
STRING |
The context of the activity. |
|
sl_subtype |
STRING |
The subtype of the activity. |
pending |
sl_item_quantity |
DECIMAL |
The line items quantity, if it’s a purchase activity. |
10 |
sl_item_price |
DECIMAL |
The line items price, if it’s a purchase activity. |
15.1 |
sl_location |
STRING |
The activity’s location. |
New York |
sl_tz |
STRING |
The timezone of the activity timestamp. |
UTC |
sl_offset |
BIGINT |
The timezone offset relative to GMT or UTC. |
-7 |
bigint_attr |
MAP<STRING, BIGINT> |
This column contains the dynamic attributes of an activity with type BIGINT. |
{ “dyn_bigint” : 10} |
decimal_attr |
MAP<STRING, DECIMAL> |
This column contains the dynamic attributes of an activity with type DECIMAL. |
{ “dyn_decimal” : 10.1} |
timestamp_attr |
MAP<STRING, TIMESTAMP> |
This column contains the dynamic attributes of an activity with type TIMESTAMP/DATETIME. |
{ “dyn_ts” : “2020-04-04 01:01:01”} |
date_attr |
MAP<STRING, DATE> |
This column contains the dynamic attributes of an activity with type DATE. |
{ “dyn_date” : “2020-04-04”} |
string_attr |
MAP<STRING, STRING> |
This column contains the dynamic attributes of an activity with type STRING. |
{ “dyn_str” : “a customer defined value”} |
boolean_attr |
MAP<STRING, BOOLEAN> |
This column contains the dynamic attributes of an activity with type BOOLEAN. |
{ “dyn_bool” : true} |
list_of_string_attr |
MAP<STRING, ARRAY<STRING>> |
This column contains the dynamic attributes of an activity with type LIST OF STRING. |
{ “dyn_list_of_str” :[‘a’, ‘b’, ‘c’]} |
Writing Queries Against the Activity Item Table
Basic Queries
SELECT sl_activity_ts, sl_type, sl_member_id
FROM ACTIVITY_ITEM_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUE
Querying Dynamic Attributes
Marigold’s query engine allows the dynamic attributes to be abstracted to query writers. For example, we have a dynamic attribute named: a_boolean_attribute. Query writers can just write it this way:
SELECT sl_activity_ts, sl_type, a_boolean_attribute
FROM ACTIVITY_ITEM_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUE
Since it was a dynamic attribute, the query will automatically transformed by Marigold’s query engine to:
SELECT sl_activity_ts, sl_type, boolean_attr[“a_boolean_attribute”]
FROM ACTIVITY_ITEM_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUE
Again, querying for dynamic columns and static columns are no different in Marigold’s query engine.
When doing ad hoc queries, please always reduce the query to a certain period so queries will be optimal. Always use in_period UDF if possible. See this. |
Event Table
The event table is used to store information about Events. An event is conceptually similar to an Activity in that it represents some action or behavior that the consumer takes, and that the platform is able to capture and track. An Event is essentially a lightweight, high-volume version of an Activity that's used mainly for reporting and auditing purposes.
Event Table Columns
Column |
Type |
Description |
Example |
environment_id |
STRING |
The customer identifier. |
100 |
organization_id |
BIGINT |
Sub Customer Identifier. |
100383 |
member_id |
STRING |
The member_id in which the event is connected to. This can be joined with the member_id column on the MEMBER table. |
M-000010101 |
event_id |
STRING |
The event’s identifier. Though this is NOT totally unique. |
1 |
integration_id |
STRING |
Event Identifier to identify a “send” record. |
8227927:100:1752025474 |
event_ts |
TIMESTAMP |
The timestamp of the event occurrence. |
2020-01-01 01:01:01.000 |
event_type |
STRING |
The event type. |
open, click, send, etc. |
local_tz |
STRING |
Timezone ID for the event. |
UTC |
processed_at |
TIMESTAMP |
The timestamp when the event was processed and written into Hive. |
2020-01-01 01:01:01.000 |
ext_parent_id |
STRING |
The event’s external parent ID. For example, a click event could have the same |
|
camp_id |
BIGINT |
The campaign identifier. |
17271 |
camp_name |
STRING |
The campaign name. |
20200416-FPS-FiletLob |
camp_version |
STRING |
The campaign identifier. |
17271 |
channel_type_name |
STRING |
Channel used by the campaign. |
Email Message |
campaign_type_name |
STRING |
Campaign type used. Either one-time, date triggered or event triggered. |
Regular (one-time) |
content_body_type_name |
STRING |
Content type the event is clicked on. Either HTML or Text. This is only applicable for click events. |
HTML |
content_id |
BIGINT |
Content Id the event is clicked on. This is only applicable for click events. |
237323 |
content_name |
STRING |
Content Name the event is clicked on. This is only applicable for click events. |
MahiFamilyBundle_CONTENT |
content_type_name |
STRING |
Content type the event is clicked on. Either HTML or Text. This is only applicable for click events. |
HTML |
user_agent_raw |
STRING |
User Agent String for the open or click event. This is only applicable for click and open events. |
Mozilla/5.0 (iPhone; CPU iPhone OS 13_3_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Mobile/15E148 |
ip_address |
STRING |
IP address for the open or click event. This is only applicable for click and open events. |
10.24.71.6 |
click_link_url |
STRING |
URL Link of the click event. This is only applicable for click events. |
https://google.com |
click_link_name |
STRING |
URL Link Name of the click event. This is only applicable for click events. |
|
click_link_tags |
STRING |
Link tags applied to the link. This is only applicable for click events. |
["20PercentOff", "3Course", "BloodyMaryMiniBar", "in-store", "Reservations", "ThrowbackThursday"] |
category_name |
STRING |
Category applied to the link. This is only applicable for click events. |
web |
web_event_submission_id |
BIGINT |
The web submission identifier to the web event it refers to. Only applicable for web events. Only applicable for Messaging app. |
1001 |
web_event_id |
BIGINT |
Web event identifier to the web event it refers to. Only applicable for web events. |
1 |
web_event_url_referrer |
STRING |
URL the web event refers from. Only applicable for web events. |
https://infini.com/pysearch?q=best+ev er+bread+pudding+5+stars&wunv=2 |
web_event_post_id |
BIGINT |
The post processing identifier. Only applicable for web events. Only applicable for Messaging app. |
|
bounce_category_hard_bounce |
BOOLEAN |
Is this bounce event “hard bounce”. Only applicable for bounce events. |
TRUE/FALSE |
source |
STRING |
The source application that creates this event |
RTX |
source_json |
STRING |
||
params_number |
MAP<STRING, DECIMAL> |
List of number parameters for the event. |
{ "bounce_type_id" : 2500, "base_cont_id" : 182813 } |
params_string |
MAP<STRING, STRING> |
List of string parameters for the event. |
{"Bus_Unit" : "search", "Device" : "MOBILE", "Type" : "search"} |
params_date |
MAP<STRING, TIMESTAMP> |
List of date parameters for the event. |
|
params_json |
STRING |
List of all parameters for the event. All data types applied. A Stringified JSON. |
"{""bounce_type_id" " : 2500, ""base_cont_id"" : 182813, ""Bus_Unit"" : ""search"", ""Device"" : ""MOBILE"", ""Type"" : ""search""}" |
geolocation_latitude |
STRING |
The latitude where the event occurred. |
37.4043 |
geolocation_longitude |
STRING |
The longitude where the event occurred. |
-122.0748 |
place_id |
STRING |
The place identifier where the event occurred. |
|
region_id |
STRING |
The region identifier where the event occurred. |
|
geolocation_city |
STRING |
The city where the event occurred. |
Mountain View |
geolocation_postal_code |
STRING |
The postal code where the event occurred. |
94043 |
geolocation_dma |
STRING |
The designated market area where the event occurred. |
Little Rock-Pine Bluff AR |
geolocation_phone_area_code |
STRING |
The area code where the event occurred. |
650 |
geolocation |
STRING |
The geolocation of the event. Combination of the latitude and longitude. |
(37.4043, -122.0748) |
geolocation_region |
STRING |
The region where the event occurred. |
California |
geolocation_country |
STRING |
The country where the event occurred. |
US |
event_subtype |
STRING |
The event subtype. |
|
send_time |
STRING |
Timestamp when the event is sent. For send events, event_ts is the same as send_time. |
2020-01-01 01:01:01.000 |
business_unit |
STRING |
The business unit name the event is sent from. It refers to a sub customer name. |
AFG |
time_partition |
STRING |
Index field for the table. Time partition record the year and month. YYYY-MM. |
2020-04 |
Writing Queries Against the EVENT Table
Basic Queries
SELECT event_ts, event_type, member_id
FROM EVENT_TABLE
WHERE in_period(sl_activity_ts, “last1m”) = TRUEAND event_type IN
(“open”, “click”)
JOINS with the MEMBER Table
SELECT event_ts, event_type, member_id
FROM EVENT_TABLE e
INNER JOIN MEMBER_TABLE m
ON m.member_id = e.member_id
WHERE in_period(event_ts, “last1m”) = TRUEAND event_type
IN (“open”, “click”)
When doing ad hoc queries, please always reduce the query to a certain period and provide the event_type so queries will be optimal. Always use in_period UDF if possible. See this. |
Campaign Table
The campaign table is used to store information about Messaging Campaigns.
Campaign Table Columns
Column |
Type |
Description |
Example |
organization_id |
BIGINT |
The customer identifier. |
100 |
environment_id |
STRING |
Sub Customer Identifier. |
100383 |
campaign_id |
BIGINT |
The campaign identifier. |
17271 |
campaign_version |
STRING |
The campaign version. |
1 |
campaign_name |
STRING |
The campaign name. |
20200416-FPS-FiletLob |
campaign_type |
STRING |
Campaign type used. Either one-time, date triggered or event triggered. |
Regular (one-time) |
content_type |
STRING |
Content type the event is clicked on. Either HTML or Text. This is only applicable for click events. |
HTML |
channel |
STRING |
Channel used by the campaign. |
Email Message |
description |
STRING |
The campaign details/description. |
Drive foot traffic into stores with the incentive of vanishing quantity. |
category |
STRING |
The campaign category. |
sports |
subcategory |
STRING |
The campaign subcategory. |
nl.sports.fr |
offer_type |
STRING |
The campaign offer type. |
loyalty points |
message_type |
STRING |
The campaign message type. |
reminder |
tags |
STRING |
Campaign tags applied to the campaign. |
["national", "nascar"] |
params_number |
MAP<STRING, DECIMAL> |
List of number parameters for the event. |
{ "bounce_type_id" : 2500, "base_cont_id" : 182813 } |
params_string |
MAP<STRING, STRING> |
List of string parameters for the event. |
{"Bus_Unit" : "search", "Device" : "MOBILE", "Type" : "search"} |
params_date |
MAP<STRING, TIMESTAMP> |
List of date parameters for the event. |
{ "some_date" : "2020-10-11 00:00:00.111"} |
params_json |
STRING |
List of all parameters for the event. All data types applied. A Stringified JSON. |
"{""bounce_type_id" " : 2500, ""base_cont_id"" : 182813, ""Bus_Unit"" : ""search"", ""Device"" : ""MOBILE"", ""Type"" : ""search""}" |
meta_data_number |
MAP<STRING, DECIMAL> |
List of number metadata for the campaign. |
{ "price" : null } |
meta_data_string |
MAP<STRING, STRING> |
List of string metadata for the campaign |
{ "category_subtype":"Manual ", "utm_campaign_name":"Halloween- Fallback", "message_type":"light-personalized ", "category" :"National", "primary_message" : "Holiday ", "utm_medium" : "email-instore" } |
meta_data_date |
MAP<STRING, TIMESTAMP> |
List of date metadata for the campaign |
{ "some_date" : "2020-10-11 00:00:00.111"} |
meta_data_json |
STRING |
List of all metadata for the campaign. All data types are applied. A stringified JSON. |
"{ ""price"": null, ""category_subtype"":""Manual "", ""utm_campaign_name"":""Halloween- Fallback"", ""message_type"":""light-personalized "", ""category"":""National"", ""primary_message"":""Holiday "", ""utm_medium"":""email-instore"" }" |
send_start_ts |
TIMESTAMP |
The timestamp wihen the campaign is launched. |
2020-04-16 20:27:44.844 |
send_end_ts |
TIMESTAMP |
The timestamp when the campaign ends. |
2020-04-16 20:27:44.844 |
send_tz |
STRING |
The timezone identifier for the campaign when it was launched. |
Etc/UTC |
Writing Queries Against the Campaign Table
Basic Queries
SELECT campaign_id, campaign_name, campaign_version
FROM CAMPAIGN_TABLE
JOINS with the EVENT Table
SELECT event_ts, event_type, member_id
FROM EVENT_TABLE e
INNER JOIN CAMPAIGN_TABLE c
ON c.campaign_id = e.camp_id AND
c.campaign_version = e.camp_version AND
campaign_name = e.camp_name
WHERE in_period(event_ts, “last1m”) = TRUEAND event_type
IN (“open”, “click”)
Marigold Hive User-Defined Functions
in_period()
Checks if the given date is within the evaluated period based on the given expression.
Parameters
- Date - the date in question
- String - time period expression that will be converted to a start date and end date (see Time Period Expressions)
Return
- Return type is Boolean
- Returns true if the date in question is within the start and end date, false otherwise.
Usage
- in_period(<timestamp>,’<expression>’)
Example
SELECTcount(*)
FROMACTIVITY_TABLE
WHEREin_period(sl_activity_ts,'last1m')=TRUE
|
age()
Computes the age of the given date with respect to the current date.
Parameters
- Date - the date whose age will be evaluated
- String - Possible values: year, month, day, hour. Defaults to year if given value is invalid
Return
- Return type is Integer
- Returns the age of the given date with respect to the specified time unit.
Usage
- age(<timestamp>,’<year|month|day|hour>’)
Example
SELECTcount(*)
FROMMEMBER_TABLE
WHEREage(member_since,'month')< 1
get_age_group()
Computes for the age group of the given date
For a configurable grouping(custom thresholds and labels), see section iv.
Parameters
- Date - the date whose age group (in years) will be evaluated
Return
- Return type is string
- Returns the age group of the given date, in years
- The age grouping is currently static:
- <15
- 15 - 24
- 25 - 34
- 35 - 44
- 45 - 54
- 55 - 64
- 65+
- Other
Usage
- get_age_group(<timestamp>)
Example
SELECTget_age_group(birthdate)AS"Age Group",
count(*)
FROMMEMBER_TABLE
GROUPBY"Age Group"
get_age_group() - extended
Computes for the age group of the given date with a configurable grouping and group labels.
Parameters
- Date - the date whose age group will be evaluated
- String - Possible values: year, month, day. Defaults to year if given value is invalid
- Decimal Array - Contains the threshold values (exclusive) in order from lowest to highest
- String Array - Contains the group label ordered corresponding to the same ordering in the third parameter. This will always have one element more than the size of the array in the third parameter. This last label will be used if calculated age is greater than the last value on the array in the third parameter.
Return
- Return type is string
- Returns the corresponding age group label of the given date
Usage
- get_age_group(<timestamp>,’<year|month|day>’, array(int), array(string))
Example
SELECTget_age_group(birthdate,
'year',
array(15, 25, 35, 45, 55, 65),
array('<15', '15-24', '25-34', '35-44', '45-54', '55-64', '65+'))
AS `Age Group`, count(*)
FROM MEMBER_TABLE
GROUP BYget_age_group(birthdate,
'year',
array(15, 25, 35, 45, 55, 65),
array('<15', '15-24', '25-34', '35-44', '45-54', '55-64', '65+'))
get_engagement_level()
Matches the given integer/count to an engagement level/grouping.
Parameters
- Integer - Activity count
Return
- Return type is string
- Return the engagement level/grouping
- The engagement grouping is currently static:
- 0
- 1 - 4
- 5 - 9
- 10 - 14
- 15+
Usage
- get_engagement_level(<activity count>)
Example
SELECTcurrent_tierAS"Tier",
get_engagement_level(count(*))
FROMMEMBER_TABLE
GROUPBY"Tier"
sl_sign()
Based on the earn_type, a positive or negative multiplier is returned
Parameters
- String - the earn type
Return
- Returns type is integer
- Returns 1 if earn type is “earn”, -1 if “expire” or “redeem”, and 0 otherwise
Usage
- sl_sign(earn_type) or sl_sign(‘<earn type>’)
Example
SELECTSUM(sl_sign(earn_type)*%PrimaryMetric)AS"Balance"
FROMACTIVITY_TABLE
get_region()
Get the region based on the mailing state.
Parameters
- String - the mailing state or state
Return
- Return type is string
- Returns the region of the given mailing state, currently only US regions are supported
- The regions are currently static:
- West
- Southwest
- Southeast
- Midwest
- Northeast
Usage
- get_region(mailing_state) or get_region(‘<state>’)
Example
SELECTget_region(mailing_state)AS"Region",
count(*)
FROMMEMBER_TABLE
GROUPBY"Region"
get_all_time_balance()
Takes a decimal array with length of 3 as parameter. It computes the all-time balance while assuming that the first element of the array is the all-time earn value, the second element is the all-time redeem value, and the third element is the all-time expire value.
Parameters
- Decimal Array - the array to compute the all-time balance from
Return
- Return type is a decimal value
- Returns array[0] - array[1] - array[2]
Usage
- get_all_time_balance(<metric>)
Example
SELECTSUM(COALESCE(
get_all_time_balance(
array(all_time_metrics_earned["point"],
all_time_metrics_redeemed["point"],
all_time_metrics_expired["point"]))
,0))AS"Balance"
FROMMEMBER_TABLE
format_timestamp_string()
Parameters
- String - Takes a timestamp formatted string to be formatted to: yyyy-MM-dd'T'HH:mm:ss.SSS
Return
- yyyy-MM-dd'T'HH:mm:ss.SSS formatted string, if timestamp format is unrecognized the input string is returned.
Usage
- format_timestamp_string('2015-04-11T10:30:45.123+08:00')
Example
SELECTcast(format_timestamp_string(member_since) astimestamp) as member_since
FROM MEMBER_TABLE
get_number_group()
Computes for the grouping of the given number column(can be decimal type) with a configurable grouping and group labels. This is similar to get_age_group except that this one is for numbers and accepts only 3 parameters.
Parameters
- Number - the number column that will be evaluated against the ranges.
- Decimal Array - Contains the threshold values (exclusive) in order from lowest to highest
- String Array - Contains the group label ordered corresponding to the same ordering in the third parameter. This will always have one element more than the size of the array in the third parameter. This last label will be used if the number is greater than the last value on the array in the third parameter.
Return
- Return type is string
- Returns the group label where the number column should belong to
Usage
- get_number_group(<number>, array(int), array(string))
Example
SELECT get_number_group(total_score, array(100, 200, 300, 400, 500),
array('Regular', 'Premium', 'Silver', 'Gold', 'Diamond', 'Elite')) AS`Membership Category`
FROM MEMBER_TABLE
sum_array()
Takes an array of numbers as a parameter. It computes the sum of the elements of the given array.
Parameters
- Number Array
Return
- Return type is a decimal value
- Returns the sum of the elements of the given array (e.g. array[0] + ... + array[n])
Usage
- sum_array(<array>)
Example
SELECT member_id,
sum_array(map_values(all_time_activities))
FROMMEMBERTABLE
same_array()
Takes 2 arrays (of primitive elements) and an optional boolean as parameters. Checks if the provided arrays are equal given that order of the elements will be ignored or not.
Parameters
- Array/List of Primitive Elements
- Array/List of Primitive Elements
- Boolean
- True: ignore order of elements
- False: includes order of elements
- Null/No 3rd parameter: this value defaults to False
Return
- Return type is a boolean value
- Returns True if the two arrays contain the same elements with order of elements ignored/included, otherwise, returns False
- If the 1st or 2nd parameter (or both) is null, the returned value defaults to False
Usage
- same_array(<array>, <array>, [optional] <boolean>)
Example
SELECT * FROMMEMBERTABLE
WHERE same_array(list_of_string_attr["browsing_preference"],
array("TV Series"), true) = TRUE
is_subset()
Takes 2 arrays (of primitive elements) as parameters. Checks whether the first array parameter is a subset of the second array parameter.
Parameters
- Array/List of Primitive Elements - array to be check if it is a subset of second array
- Array/List of Primitive Elements - reference array where the first array will be checked
Return
- Return type is a boolean value
- Returns True if the first array is a subset of second array, otherwise, returns False
- If the 1st or 2nd parameter (or both) is null, the returned value defaults to False
Usage
- is_subset(<arrayToCheck>, <referenceArray>)
Example
SELECT * FROMMEMBERTABLE
WHERE is_subset(array("TV Series", "Anime Series"),
list_of_string_attr["browsing_preference"]) = true
Time Period Expressions
- MM/dd/yyyy-MM/dd/yyyy: Period between the first date and the second date, inclusive
- MM/dd/yyyy or MM/dd/yyyy-: Period since first date
- alltime: all time
- ytd: Year to date
- mtd: Month to date
- wtd: Week to date
- last**<n>**d: last n days; e.g. last5d
- last**<n>**w: last n weeks; e.g. last3w
- last**<n>**m: last n months; e.g. last2m
- prev**<n>**d: previous n days; e.g. prev5d. The 5 days before the last 5 days. Used for period-to-period comparison.
- prev**<n>**w: previous n weeks; e.g. prev5w. The 5 weeks before the last 5 weeks. Used for period-to-period comparison.
- prev**<n>**d: previous n months; e.g. prev5m. The 5 months before the last 5 months. Used for period-to-period comparison.
All times are 12am midnight. Date is in UTC.
Native Hive Functions
See this guide.